Optimistic distributionally robust optimization for nonparametric likelihood approximation
The likelihood function is a fundamental component of Bayesian statistics. However, evaluating the likelihood of an observation is computationally intractable in many applications. In this paper, we propose a nonparametric approximation of the likelihood that identifies a probability measure in the neighborhood of the nominal measure which maximizes the probability of observing the given sample point. We show that when the neighborhood is constructed via the Kullback-Leibler divergence, moment conditions, or the Wasserstein distance, the optimistic likelihood can be determined by solving a convex optimization problem, and it admits an analytical expression in particular cases. We also show that posterior inference with our optimistic likelihood approximation enjoys strong theoretical performance guarantees, and it performs competitively in a probabilistic classification task.
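To make the idea concrete, here is a minimal sketch of an optimistic likelihood in a toy discrete special case (not the paper's general nonparametric construction): the nominal measure `p` is a probability vector, the neighborhood is the KL ball `{q : KL(q || p) <= r}`, and we maximize the mass `q[i0]` placed on the observed atom. The KL-optimal measure keeps the remaining mass proportional to `p`, so the problem reduces to a one-dimensional search.

```python
import numpy as np

# Toy sketch (assumed discrete special case): maximize q[i0] over the
# KL ball {q : KL(q || p) <= r} around a discrete nominal measure p.
# The optimizer puts mass t on atom i0 and scales the other atoms
# proportionally to p, so KL reduces to a scalar function of t.
def optimistic_likelihood(p, i0, r):
    p = np.asarray(p, dtype=float)
    rest_sum = np.delete(p, i0).sum()

    def kl(t):
        # KL(q || p) when q = (t on atom i0, (1 - t) spread like p elsewhere)
        return t * np.log(t / p[i0]) + (1 - t) * np.log((1 - t) / rest_sum)

    # kl(t) is 0 at t = p[i0] and increasing on (p[i0], 1),
    # so bisect for the largest feasible t.
    lo, hi = p[i0], 1.0 - 1e-12
    if kl(hi) <= r:
        return hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kl(mid) <= r:
            lo = mid
        else:
            hi = mid
    return lo
```

With radius `r = 0` this recovers the nominal likelihood `p[i0]`, and the value increases monotonically toward 1 as the neighborhood grows, matching the "optimistic" interpretation.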
Semi-supervised Learning based on Distributionally Robust Optimization
We propose a novel method for semi-supervised learning (SSL) based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. Our proposed method improves generalization error by using the unlabeled data to restrict the support of the worst-case distribution in our DRO formulation. We make the DRO formulation practical by proposing a stochastic gradient descent algorithm that renders the training procedure easy to implement. We demonstrate that our semi-supervised DRO method improves the generalization error over natural supervised procedures and state-of-the-art SSL estimators. Finally, we include a discussion of the large-sample behavior of the optimal uncertainty region in the DRO formulation, which exposes important aspects such as the role of dimension reduction in SSL.
Calculating optimistic likelihoods using (geodesically) convex optimization
A fundamental problem arising in many areas of machine learning is the evaluation of the likelihood of a given observation under different nominal distributions. Frequently, these nominal distributions are themselves estimated from data, which makes them susceptible to estimation errors. We thus propose to replace each nominal distribution with an ambiguity set containing all distributions in its vicinity and to evaluate an optimistic likelihood, that is, the maximum of the likelihood over all distributions in the ambiguity set. When the proximity of distributions is quantified by the Fisher-Rao distance or the Kullback-Leibler divergence, the emerging optimistic likelihoods can be computed efficiently using either geodesic or standard convex optimization techniques. We showcase the advantages of working with optimistic likelihoods on a classification problem using synthetic as well as empirical data.
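One analytically tractable special case (an assumed illustration, not the paper's general treatment) is a Gaussian ambiguity set in which only the mean varies: over means in the KL ball `KL(N(mu, S) || N(mu0, S)) <= r` with fixed covariance `S`, the KL divergence reduces to half the squared Mahalanobis distance between the means, so the optimistic likelihood shifts the mean straight toward the observation, clipped at Mahalanobis radius `sqrt(2r)`.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hedged sketch of a Gaussian special case: maximize the density of x
# over means mu with KL(N(mu, S) || N(mu0, S)) <= r, covariance S fixed.
# KL equals 0.5 * Mahalanobis(mu, mu0)^2, so the best feasible mean moves
# toward x along the straight line, clipped at radius sqrt(2 * r).
def optimistic_gaussian_likelihood(x, mu0, S, r):
    Sinv = np.linalg.inv(S)
    d = x - mu0
    maha = np.sqrt(d @ Sinv @ d)                 # Mahalanobis distance to x
    step = min(1.0, np.sqrt(2 * r) / maha) if maha > 0 else 0.0
    mu_star = mu0 + step * d                     # clipped optimal mean
    return multivariate_normal(mu_star, S).pdf(x)
```

At `r = 0` this returns the nominal likelihood; for large `r` the mean reaches `x` and the value saturates at the density's peak.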
Calculating optimistic likelihoods using (geodesically) convex optimization
33rd Conference on Neural Information Processing Systems (NeurIPS 2019), 8-14 Dec 2019, Vancouver, Canada. Version of Record. Self-funded. Published.
Optimistic distributionally robust optimization for nonparametric likelihood approximation
33rd Conference on Neural Information Processing Systems (NeurIPS 2019), 8-14 Dec 2019, Vancouver, Canada. Version of Record. Funders: Others, EPSRC. Published.
Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning
Many decision problems in science, engineering and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many training samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution, especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples. In this tutorial we will argue that this approach has many conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample and asymptotic consistency guarantees. We will also show that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation, or minimum mean square error estimation, among others.
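One such ramification can be sketched concretely (a hedged special case, under an assumed infinity-norm transport cost, not the tutorial's general theory): for absolute-loss linear regression, the worst-case expected loss over a type-1 Wasserstein ball of radius `eps` equals the empirical loss plus `eps` times the Lipschitz constant of the loss, here the 1-norm of the extended coefficient vector `(beta, -1)`. Wasserstein DRO thus recovers a Lasso-style regularizer.

```python
import numpy as np

# Hedged sketch: for the loss (x, y) -> |y - beta @ x|, which is Lipschitz
# with constant ||beta||_1 + 1 under the infinity norm, the worst case over
# a W1 ball of radius eps adds exactly eps times that constant, because
# mass can be shifted in the steepest direction on an unbounded domain.
def robust_absolute_loss(beta, X, y, eps):
    empirical = np.mean(np.abs(y - X @ beta))
    lipschitz = np.abs(beta).sum() + 1.0      # 1-norm of (beta, -1)
    return empirical + eps * lipschitz        # regularized reformulation
```

The bound is attained by shifting every sample `eps` in the direction that grows its residual, which is how the test below checks the identity numerically.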